Individual Poster Page

See copyright notice at the bottom of this page.

Felipe Alou: Is He Afraid of the Walk?

November 14, 2002 - bob mong (www) (e-mail)

I would suggest that a much larger role for the manager is in selecting who is on the team and assigning roles to those players. As such, I think that Alou's club's walk totals are more telling than looking at a handful of players.
Perhaps an interesting study could be done:
Look for players on Alou's teams that he didn't give playing time to, yet deserved it based on their numbers once freed from Alou's control.
As an aside, I was just looking at the 1993 Expos' roster. It was amazingly young. There were only two players on the entire team who were older than 30: Dennis Martinez (38 in 1993) and Randy Ready (33). Only four players were even 30 or older - Jeff Fassero (30), who pitched 150 pretty good innings in relief (mostly), Dennis Martinez, who pitched 230 about league-average innings as a starter, Bruce Walton (30), who pitched 6 innings, and Ready, the sole 30+ hitter, who split 40 games (150 AB) between 2B and 1B.
Every member of the starting lineup was 27 or younger. Three of them were 24 or younger. The starting outfield, a trio of 26-year-olds, was Moises Alou, Larry Walker, and Marquis Grissom.
John Vander Wal (27), Cliff Floyd (20), Rondell White (21), and Matt Stairs (25) were on the bench. With this amazingly young team, they won 94 games and were 2nd in the NL East.
Is it really any surprise, looking back now, that they were possibly the best team in baseball in 1994? The entire team, pretty much, hit its prime in 1994.

Felipe Alou: Is He Afraid of the Walk?

November 14, 2002 - bob mong (www) (e-mail)

I read MGL and Rauseo's comments here about mgl being flamed and treated poorly. Can someone tell me where to find that? I'd like to see what it was about.
I believe this is the thread.
MGL's original post. Flame #1. MGL's response. Flame #2. A more nuanced response to MGL's response.
All in all, I don't think the episode was that big of a deal, but maybe I am referencing the wrong situation (though I haven't seen anything else that would fit the bill).

Bonderman and Age 20 (June 26, 2003)

Discussion Thread

Posted 12:08 p.m., June 27, 2003 (#1) - bob mong(e-mail) (homepage)
Thanks for the link, Tango.

Anyone have any ideas about good ways to improve my so-called study and/or increase the sample size?

Also, can anyone fill in details about the six pitchers' careers, especially with regards to injury history?

Estimating Pitch Counts (July 2, 2003)

Discussion Thread

Posted 2:24 p.m., July 2, 2003 (#4) - bob mong
It's really silly to count a Pedro IP and a bad pitcher's IP the same.

I don't see the problem with this. I mean, IP is just a proxy for outs, which is a possible plate appearance outcome, just like hit, walk, or strikeout. How is a groundout to 2B different, whether Pedro induces it or Lima does?

Actually, I am assuming you are right about this, since you usually are; what am I missing?

Estimating Pitch Counts (July 2, 2003)

Discussion Thread

Posted 7:47 p.m., July 3, 2003 (#6) - bob mong
Something else I have been wondering:

In the past, pitchers hit better, as a whole, than they do now. Should that affect our estimates of historical pitch counts?

League Equivalency (July 2, 2003)

Discussion Thread

Posted 11:50 a.m., July 3, 2003 (#3) - bob mong
That means the group of players changing leagues is unlikely to be similar to the group of players not changing leagues.

Might it be better to do a matched-pair study? Match each league-switching player with a non-league-switching player of the same age - you might want to try to match for playing time and quality as well, I don't know.

10 runs = 1 win? (August 1, 2003)

Discussion Thread

Posted 4:44 p.m., August 7, 2003 (#1) - bob mong
This is interesting...but what is the context? Where did the formula
WPCT = 1/2 + (RS-RA)/10
come from?

And what is the meaning of the title, "10 runs = 1 win?"

I suppose I am just out of the loop here - but I would like to be in the loop!

Extended Pitch Count Estimator (August 4, 2003)

Discussion Thread

Posted 5:16 p.m., August 5, 2003 (#1) - bob mong
You gave these formulas (formulae?):

Pitches per BIP = 2.5 x [(1 - BIP rate) ^ 0.08] + 1
Pitches per BB = 1.5 x [(1 - BIP rate) ^ 0.10] + 4
Pitches per K = 1.9 x [(1 - BIP rate) ^ 0.07] + 3

A couple of questions/comments:

First, where did the 2.5, 1.5, and 1.9 #s come from?
Second, where did the 0.08, 0.10, and 0.07 #s come from?
Third, you are using the assumption that a BIP rate of 100% implies 1 pitch per PA - though I admit that this kind of makes intuitive sense, it still seems a big step. What is your justification for this assumption?
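
For anyone who wants to plug numbers into the three formulas quoted above, here is a minimal Python sketch. The function names are made up for illustration (they are not Tango's), and bip_rate is assumed to be the fraction of plate appearances that end with a ball in play, from 0 to 1.

```python
def pitches_per_bip(bip_rate: float) -> float:
    """Estimated pitches per ball in play, using the quoted formula."""
    return 2.5 * (1.0 - bip_rate) ** 0.08 + 1.0


def pitches_per_bb(bip_rate: float) -> float:
    """Estimated pitches per walk, using the quoted formula."""
    return 1.5 * (1.0 - bip_rate) ** 0.10 + 4.0


def pitches_per_k(bip_rate: float) -> float:
    """Estimated pitches per strikeout, using the quoted formula."""
    return 1.9 * (1.0 - bip_rate) ** 0.07 + 3.0


if __name__ == "__main__":
    # The rates below are arbitrary sample inputs, not league data.
    for rate in (0.60, 0.75, 0.90, 1.00):
        print(
            f"BIP rate {rate:.2f}: "
            f"BIP {pitches_per_bip(rate):.2f}, "
            f"BB {pitches_per_bb(rate):.2f}, "
            f"K {pitches_per_k(rate):.2f}"
        )
```

At a BIP rate of exactly 1.0 the power term drops to zero, leaving just the constants (1, 4, and 3), which is the end-point assumption the third question is about.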

Extended Pitch Count Estimator (August 4, 2003)

Discussion Thread

Posted 11:57 a.m., August 6, 2003 (#3) - bob mong(e-mail) (homepage)
That kinda makes sense...but I am still not buying it.

Here's why:

Assumption 1a: There are no swinging strikeouts. From this it must follow that a batter can make contact with every strike.
Assumption 1b: There are no called strikeouts. From this it must follow that there is a big enough incentive for the batter to avoid strikeouts that he will swing at every strike when he has two strikes on him already.

Assumption 2: There are no walks. From this it must follow that the pitcher can throw a strike every time, if he so chooses. And furthermore, there must be a big enough incentive for the pitcher to avoid walks that he WILL choose to throw a strike every time there are three balls.

Certain things follow from these assumptions:

From (1a) the batter knows that under no circumstances can he be struck out. Even if he has two strikes on him, he can always make contact with every possible third strike. So why not take/foul-off a few pitches, wait for the one that is the fattest?

From (2) the pitcher knows that under no circumstances will he give up a walk. Even if the count has three balls, he can always throw a strike. So why not throw a couple of bad pitches early in the count, hoping to get the batter to chase them?

Here is an example of where this situation could exist:

Imagine a slow-pitch softball league of very high quality (where the relevant rules (4-ball walks, 3-strike Ks, etc.) are the same as MLB baseball).

In slow-pitch softball, nobody strikes out, for obvious reasons (those given in (1a) and (1b) above: it is so damn easy to hit a strike and so shameful to strike out that nobody ever does).

Since we are assuming a very high quality of play, let's also assume that the pitchers are good enough that there are no walks (or so close to zero that it is basically the same).

So we have a league where nobody strikes out and nobody walks - in this league, will the batters always swing at, and put in play, the first pitch they see?

I doubt it - I think that there will be a non-trivial number of foul balls, plus I think batters will be willing to take a pitch or two in order to get the fattest pitch possible, even if they have to go to two strikes to do so. And the pitcher may be willing to try to get the batter to chase a few bad pitches, knowing that he has four balls to work with, and that he can always come back with a strike and depend on his defense.
...
I don't really think this affects your model at all (since your model isn't really concerned with that extreme end-point), but I think that it does break down at that end point.

DIPS year-to-year correlations, 1972-1992 (August 5, 2003)

Discussion Thread

Posted 12:32 p.m., August 13, 2003 (#85) - bob mong
Great stuff, everyone!

A few minor notes:

Alan Jordan wrote: You said -
The standard deviation is given by
STD = sqrt(n*p*(1-p))

This is wrong. STD =Sqrt(P*(1-P)). N doesn’t come in to play.

Actually, you are both wrong. First of all, that equation for STD is wrong; sample size does matter. To quote from my statistics textbook ("Applied Statistical Methods," Carlson and Thorne, 1997):

Page 49 (in the box): "The sample standard deviation, s, and the population standard deviation, σ, are defined as

s = squareroot(s²) = squareroot( (sum[over i] (x_i - x_avg)²) / (n - 1) )

and

σ = squareroot(σ²) = squareroot ( (sum[over i] (x_i - µ)²) / N)"

Where x_avg is the sample mean, n is the sample size, µ is the population mean, and N is the population size.

This is applicable for any and all distributions; it is the definition of standard deviation. Notice that the sample size is, indeed, a factor.

For specific distributions, there are formulas that you can use to eliminate all that tedious summing, squaring, and square-rooting (is that a word?) -

For example, on page 189, in the box, is given the variance of the binomial distribution (the standard deviation, as I am sure you are all aware, is the square root of the variance):

"and the variance is

σ²_x = nπ(1-π)"

Where "n is the number of independent Bernoulli trials and π is the probability of success for each Bernoulli trial."

Which would imply that the standard deviation, assuming a binomial distribution, is:

squareroot(nπ(1-π))

However, Tango was right when he said that the numerator doesn't matter, only the denominator. The sample size matters, not the number of successes.

And furthermore, Tango was also right when he wrote that the closer you are to 0.5, the larger the standard deviation. That follows from the formula:

π, π × (1 - π)
0.1, 0.09
0.2, 0.16
0.3, 0.21
0.4, 0.24
0.5, 0.25
0.6, 0.24
0.7, 0.21
0.8, 0.16
0.9, 0.09

As the probability gets further from 0.5, the standard deviation will become smaller, given identical sample-sizes. That is why the standard deviation of the out is smaller than the STD of 1B, and that is why the STD of XBH is smaller than 1B.
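
The table and the binomial formula above can be checked with a few lines of Python. This is only a sketch: the function name and the sample size of 500 are made up for illustration, and it assumes every trial is an independent Bernoulli trial with the same probability.

```python
import math


def binomial_sd(n: int, p: float) -> float:
    """Standard deviation of the number of successes in n Bernoulli trials."""
    return math.sqrt(n * p * (1.0 - p))


if __name__ == "__main__":
    n = 500  # hypothetical number of trials, chosen only for illustration
    print("p, p*(1-p), SD of count")
    for tenths in range(1, 10):
        p = tenths / 10
        print(f"{p:.1f}, {p * (1 - p):.2f}, {binomial_sd(n, p):.2f}")
```

With n held fixed, the printed SD is largest at p = 0.5 and shrinks symmetrically on either side, matching the table.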

Make sense?

Exactly How Full of S is OPS? (August 6, 2003)

Discussion Thread

Posted 12:10 p.m., August 6, 2003 (#1) - bob mong(e-mail) (homepage)
Who is this guy? That article is fantastic!

Well written, humorous, informative, and easy to read - get him to write some stuff for Primer!

Empirical Win Probabilities (August 28, 2003)

Discussion Thread

Posted 4:08 p.m., August 28, 2003 (#1) - bob mong
Thanks for posting this, Tango.

Road Warriors (September 4, 2003)

Discussion Thread

Posted 2:40 p.m., September 4, 2003 (#1) - bob mong
Here is some more data: Teams with a home-field advantage (HFA) of at least 8 wins.

Team, HFA

2002
Twins, +14
Red Sox, -9
Anaheim, +9
White Sox, +13
Royals, +12
Diamondbacks, +12
Expos, +15 (they didn't play in PR in 2002)
Astros, +10

2001
Twins, +9
White Sox, +11
Braves, -8
Phillies, +8
Cardinals, +15
Cubs, +8
Giants, +8

2000
Mets, +16
Giants, +13
Diamondbacks, +9

This isn't complete for all teams in these years.

Road Warriors (September 4, 2003)

Discussion Thread

Posted 4:35 p.m., September 4, 2003 (#4) - bob mong(e-mail) (homepage)
I just updated the blog entry, giving data for every team for every season between 1996 and 2002.

The most surprising to me was Florida - anybody have any ideas why they have such a large home-field advantage (especially considering that Tampa Bay does not)?

Also, it appears that home-field advantage in the AL is different than HFA in the NL. I have no idea why, other than the outliers in the NL (Colorado, Florida).

Comments?

Oh, and thanks for posting this, Tango :)

Road Warriors (September 4, 2003)

Discussion Thread

Posted 4:42 p.m., September 4, 2003 (#5) - bob mong
I don't think you can use a team's season road record to predict how they will perform in the playoffs against a few particular teams.

I dunno. I agree that the sample size is very small, and the results are inconclusive. But...

Imagine Florida, who has a 7-year HFA of 12 wins (3-year average: also 12 wins), is playing Atlanta, who has a 7-year HFA = 5 wins (3-year HFA = 1 win) in the playoffs.

Both of those teams have similar HFA in 2003 to their historical HFA (Florida 2003 HFA: +15, Atlanta 2003 HFA: +5).

I would think that home-field advantage would be very important for Florida, as well as for any team playing against Florida.

Growing Pains: Fundamental baseball (September 7, 2003)

Discussion Thread

Posted 1:44 p.m., September 8, 2003 (#1) - bob mong
One thing to add to his list would be double-plays vs. double-play opportunities. That, it seems to me, would fall in the "fundamentals" category.

Also interesting that Seattle ends up on top of his particular list. They are, mostly, an older team defensively:

C: 34 (Wilson) & 26 (Davis)
1B: 34 (Olerud)
2B: 34 (Boone)
SS: 27 (Guillen) & 35 (Sanchez)
3B: 33 (Cirillo), 27 (Guillen), 25 (Bloomquist), & 38 (McLemore)
LF: 29 (Winn)
CF: 30 (Cameron)
RF: 29 (Ichiro!)

Starters: 40 (Moyer), 27 (Garcia), 30 (Franklin), 24 (Piñeiro), 24 (Meche)
Relievers: 35 (Sasaki), 34 (Hasegawa), 33 (Rhodes), 36 (Nelson), 23 (Soriano), 25 (Mateo), 30 (Benitez)

The only defensive regulars under 27 are half-timer Ben Davis and one-time-3B Willie Bloomquist (who is now a backup UT again).
The only pitchers under 27 are Piñeiro, Meche, Soriano, and Mateo.

By The Numbers - Sept 7 (September 8, 2003)

Discussion Thread

Posted 3:37 p.m., September 8, 2003 (#2) - bob mong
Good stuff - except for that pitch-count estimator. That's a load of crap :)

Magic Numbers - RIOT -- Playoff Races (September 9, 2003)

Discussion Thread

Posted 11:28 a.m., September 12, 2003 (#1) - bob mong
Nobody has commented on this stuff, but I will: I think this is pretty cool.

Clemens' turnaround? (September 15, 2003)

Discussion Thread

Posted 6:01 p.m., September 15, 2003 (#2) - bob mong
Tango, when you say, "Who is Roger Clemens' closest comp?" are you just eyeballing it, or are you using a statistical comparison?

Sabermetrics >WIN SHARES bibliography (September 19, 2003)

Discussion Thread

Posted 7:34 p.m., September 25, 2003 (#8) - bob mong(e-mail)
I think Sean Smith has the basketball win shares somewhere...

If anyone knows where, please let me know!

Most pitches / game in a season (September 22, 2003)

Discussion Thread

Posted 7:29 p.m., September 22, 2003 (#3) - bob mong
I am assuming, Tango, that this is based on your pitch-count estimation formula (for the older seasons, anyway). Is that correct, or do you have some crazy source?

Aging patterns (September 23, 2003)

Discussion Thread

Posted 6:21 p.m., September 25, 2003 (#16) - bob mong
I tried to sorta replicate this using the Lahman db, Tango, and I got mostly the exact same figures, except:

I got $H peaking between 23 & 26 and not falling nearly as far, reaching a low of .88 at age 39 (you show a low of .72 at age 39).

How did you get your numbers?

Aging patterns (September 23, 2003)

Discussion Thread

Posted 12:05 p.m., September 30, 2003 (#25) - bob mong
Since I built a whole spreadsheet last week using the chaining, I will give a stab at explaining:

Say you know a guy's AB, BB, 1B, 2B, 3B, HR, SO, and SB for his age 25 season.
To predict his age-26 season, given his plate appearances (AB+BB) in his age-26 season, you would:

Figure his previous age-25 $BB, figure out the new rate (multiply the old $BB by .82/.84, in this case), and then, to get age-26 ABs and age-26 BBs, solve this equation:

$BBnew = (BBnew)/(ABnew) = (BBnew)/(PAnew - BBnew)

rearranging...

$BBnew * (PAnew - BBnew) = BBnew

($BBnew * PAnew) - ($BBnew * BBnew) = BBnew
$BBnew * PAnew = BBnew + ($BBnew * BBnew)
$BBnew * PAnew = BBnew * (1 + $BBnew)

Which gives us (if I did everything correctly)...

BBnew = ($BBnew * PAnew)/(1 + $BBnew)

Everything on the right side we already know, so we can find his age-26 BB (denoted as BBnew). Since we know that PA = AB + BB, then we can find his age-26 AB as well.

Now, if you look at the next rate ($K), you can see that it only uses strikeouts and AB. Since we know his new AB total, we can similarly figure his new K total, and onward and upward, etc.
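
Here is the walk step of the chaining written out as a small Python sketch. The function and argument names are made up for illustration; $BB is taken to be BB/AB and PA to be AB + BB, as in the equations above, and the .82/.84 factor is the one quoted in this example. The 500 AB / 50 BB / 550 PA player is hypothetical.

```python
def chain_walks(ab_old: float, bb_old: float, pa_new: float, rate_factor: float):
    """Return (ab_new, bb_new) for the next season.

    rate_factor is the ratio of the new-age $BB to the old-age $BB
    (for example .82/.84 in the post above).
    """
    bb_rate_old = bb_old / ab_old                          # $BB = BB / AB
    bb_rate_new = bb_rate_old * rate_factor                # aged walk rate
    bb_new = bb_rate_new * pa_new / (1.0 + bb_rate_new)    # solved equation
    ab_new = pa_new - bb_new                               # PA = AB + BB
    return ab_new, bb_new


if __name__ == "__main__":
    # Hypothetical age-25 line: 500 AB, 50 BB; 550 PA at age 26.
    ab_new, bb_new = chain_walks(500, 50, 550, 0.82 / 0.84)
    print(f"age-26 AB: {ab_new:.1f}, age-26 BB: {bb_new:.1f}")
```

The new AB total then feeds the next rate in the chain ($K), and so on down the list.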

Did this make sense?

Tango, when you ran your numbers, did you adjust for league-context?

Aging patterns (September 23, 2003)

Discussion Thread

Posted 12:50 p.m., September 30, 2003 (#27) - bob mong
Thanks, Tango.
Sorry to keep badgering you about this; I was just curious (mostly to know if my numbers are way off, indicating that I totally screwed up somewhere). :)

2003 Park Factors (October 1, 2003)

Discussion Thread

Posted 3:52 p.m., October 2, 2003 (#12) - bob mong(e-mail) (homepage)
I have a general park factors question:

With the unbalanced schedule and the outlier status of Coors field, is it accurate to compare AL park factors directly to NL park factors?

For example, check this out:

Park Factors of NL West:
Year: LA PF, SD PF, SF PF, COL PF, ARI PF
1988: 97, 98, 96 (Dodger Stadium, Qualcomm/Jack Murphy Stadium, Candlestick)
1989: 98, 100, 97
1990: 97, 101, 96
1991: 98, 103, 97
1992: 98, 102, 94 (due to three-year average, first effects of Coors are felt here)
1993: 96, 103, 96, 120 (Coors opens - note that, due to 3-year average, the full effect of Coors is not yet felt)
1994: 93, 97, 95, 116 (Full effect of Coors)
1995: 91, 96, 96, 128 (Note steady decline of Dodger Stadium)
1996: 92, 95, 96, 129
1997: 92, 93, 98, 123 (interleague play begins, first effects of the BOB are felt here)
1998: 93, 91, 96, 119, 101 (BOB opens. Full effects of BOB not yet felt)
1999: 96, 96, 89, 129, 96 (Full effects of BOB are felt)
2000: 92, 91, 91, 131, 102 (first effects of unbalanced schedule are felt: notice drop in all pitchers' parks. PacBell opens)
2001: 90, 91, 91, 122, 106 (Unbalanced schedule begins. Full effects not yet felt)
2002: 91, 92, 91, 121, 108 (This year is two-year average of 2001 & 2002. Full effects of unbalanced schedule are felt).

Notes:
Dodger Stadium had no changes in dimension between 1988 and 2002 (all ballpark info from ballparks.com). Yet, its park factor dropped from 97 or 98, between 1988 and 1991, to 90 or 91, between 2001 & 2002.
Jack Murphy Stadium (later Qualcomm) had no changes in dimension between 1988 and 2002 (except for very minor changes to the RF foul line: from 327 to 330 in 1996). Yet it went from a mild hitters' park, 1988-1993, to a severe pitchers' park, 2000-2002.
Both of these parks are in southern California, where temperature extremes and inclement weather are rare - it seems unlikely that this change could be from more rainy/cold weather.
Candlestick (later 3com) had no changes in dimension between 1988 and 1999 except for minor changes, also to the RF foul line: from 335 to 330 in 1991, and then to 328 in 1993. It seems to have had a fairly stable park factor.

My hypothesis, which I can't back up beyond what you just read, is this: Coors Field and the unbalanced schedule (especially in combination) skew the park factors of NL West clubs so much that they cannot be directly compared to AL clubs (or possibly to NL clubs in other divisions). In other words, just because PacBell had a PF of 91 in 2002 does NOT indicate that it was more of a pitchers' park than Safeco Field in 2002 (PPF=94).

Comments? Does this make sense to anyone else?

2003 Park Factors (October 1, 2003)

Discussion Thread

Posted 5:00 p.m., October 2, 2003 (#14) - bob mong
Bob, Not that it affects the larger point, but the first few years for the Rockies were played in old Mile High Stadium, not Coors Field.

Thanks for the reminder, my mistake.

RISP for hitters and pitchers (October 13, 2003)

Discussion Thread

Posted 7:06 p.m., October 13, 2003 (#1) - bob mong
I agree with you 100%, Tango - I just read this and was about to email you asking you to post it for Primate Study - but you beat me to it :)

I don't really see how the BP Boyz can tout ARP endlessly when discussing relievers, on the one hand, and completely ignore situational hitting when discussing hitters (or, to complete the analogy, pinch hitters), on the other hand.

I can see disregarding RBI, since it is a counting stat - but if ARP is so great, why don't they compile and present AVG/OBP/SLG for all runners in all base-out states, and rank hitters based on their ability to hit with RISP, for example?

This came up a bunch when the Mariners traded Jeff Nelson for Armando Benitez; Jeff Nelson, despite a reasonably pretty ERA and peripherals at the time of the trade, had a gaudy ARP (it was negative, I believe) - so lots of people argued that the Mariners got a far better reliever, since the ARP totals were so different. I argued that no one had ever shown that ARP has any predictive ability, and that traditional methods of evaluating pitchers (ERA, K:BB, etc) could certainly be applied to relievers as well.

Anatomy of a Collapse (October 15, 2003)

Discussion Thread

Posted 4:29 p.m., October 15, 2003 (#8) - bob mong
Great stuff, Tango! Breaking down a little further:

The tally:
Prior + SS = +.076
Prior + Alou = +.051
Remlinger = +.001
Remlinger + Fielders = -.016
Dusty = -.017
Fan = -.031
Prior = -.051
Gonzalez = -.184
Farns + OF = -.271
Prior + OF = -.476

Implies:
Prior: -0.226 (25%)
Alex Gonzalez (SS): -0.147 (16%)
Farnsworth: -0.1355 (15%)
Lofton: -0.1255 (14%)
Sosa: -0.1255 (14%)
Alou: -0.1 (11%)
Fan: -0.031 (3%)
Dusty: -0.017 (2%)
Remlinger: -0.007 (1%)
All other fielders (1B, 2B, 3B, C): -0.001 each (0% each)

Prior For Goat!
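
The "Implies" list is not explained step by step, so here is one reading of it as a Python sketch. The assumption (mine, not stated in the post) is that every shared entry gives half to the pitcher and splits the other half evenly among the fielders named, with "OF" meaning Alou, Lofton, and Sosa and "Fielders" meaning all eight position players. The names TALLY and split_blame are made up for illustration.

```python
from collections import defaultdict

OUTFIELD = ["Alou", "Lofton", "Sosa"]
ALL_FIELDERS = OUTFIELD + ["Gonzalez", "C", "1B", "2B", "3B"]

# (win-expectancy value, primary party, fielders sharing it) from the tally.
TALLY = [
    (+0.076, "Prior", ["Gonzalez"]),
    (+0.051, "Prior", ["Alou"]),
    (+0.001, "Remlinger", []),
    (-0.016, "Remlinger", ALL_FIELDERS),
    (-0.017, "Dusty", []),
    (-0.031, "Fan", []),
    (-0.051, "Prior", []),
    (-0.184, "Gonzalez", []),
    (-0.271, "Farnsworth", OUTFIELD),
    (-0.476, "Prior", OUTFIELD),
]


def split_blame(tally):
    """Assumed rule: half to the primary party, half split among sharers."""
    shares = defaultdict(float)
    for value, primary, sharers in tally:
        if not sharers:
            shares[primary] += value  # unshared entries go to one party
            continue
        shares[primary] += value / 2
        for name in sharers:
            shares[name] += value / 2 / len(sharers)
    return dict(shares)


if __name__ == "__main__":
    shares = split_blame(TALLY)
    total = sum(shares.values())
    for name, value in sorted(shares.items(), key=lambda item: item[1]):
        print(f"{name}: {value:+.4f} ({value / total:.0%})")
```

Under that assumed split the totals line up with the list above (Prior about -0.226, Gonzalez -0.147, Lofton and Sosa -0.1255 each).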

Anatomy of a Collapse (October 15, 2003)

Discussion Thread

Posted 4:43 p.m., October 15, 2003 (#12) - bob mong
given that it was Dusty's call to leave Prior in, I would vote Dusty for goat...

I dunno - you can't blame the manager for everything bad that happens on the field...but I think you have a point in this case. Giving half of Prior's WE to Dusty puts Prior at -0.113 (13%) and puts Dusty at -0.13 (14%) - still below Gonzalez and Farnsworth.

Gonzalez For Goat?

Managers Post-season records (October 22, 2003)

Discussion Thread

Posted 3:47 p.m., October 22, 2003 (#2) - bob mong
Thanks, Tango.

I thought it was interesting that a few of the managers that statheads/primates really like have awful postseason records.

Namely, Showalter and Dierker. Of course, small samples and all that, so it probably doesn't mean anything. But still, together they have won zero postseason series and lost 6, winning 5 games and losing 18. Ouch.

Managers Post-season records (October 22, 2003)

Discussion Thread

Posted 7:29 p.m., October 22, 2003 (#4) - bob mong
Anybody else have thoughts on who the best postseason manager of all time is, and whether that matters?

Gleeman - Game 4 (October 23, 2003)

Discussion Thread

Posted 1:21 p.m., October 23, 2003 (#2) - bob mong
Hardcore sabermetrics...

Tango's tryin' to make this blog live up to its name, I guess. :)

Cities with best players (October 23, 2003)

Discussion Thread

Posted 2:23 p.m., October 23, 2003 (#8) - bob mong
We probably should agree on at least some semblance of a top-ten list for baseball, basketball, football, and hockey before we go too much farther. Or not.

Your complete resource for Fielding (October 23, 2003)

Discussion Thread

Posted 1:24 p.m., October 23, 2003 (#1) - bob mong
While we are talkin' fielding and playoffs and whatnot...does anybody have playoff/postseason fielding stats and/or analysis? To invoke the name that launched a thousand tirades, does Jeter take his fielding up a notch in the postseason? More seriously, it would be interesting to see some kind of analysis of fielding, on a team level, at least. Hmmmm...I suppose I could do some of that...back later, maybe.

Results of the Forecast Experiment, Part 2 (October 27, 2003)

Discussion Thread

Posted 3:56 p.m., October 27, 2003 (#14) - bob mong
Tango, could you explain the Monkey? If I remember correctly, it was described, in the original article, as a three-year weighted average with some age adjustments - would you mind sharing the details?

Eck (December 10, 2003)

Discussion Thread

Posted 5:52 p.m., December 12, 2003 (#1) - bob mong(e-mail) (homepage)
This is interesting stuff. I think Eck's case for the HOF is kinda borderline, myself (and I'm fairly sure I'm in the minority among stat-head evaluators).

Here's how I see it: He was a decent starter. Approximately a 151-128 (.541) record as a starter*, with a 3.67 ERA (111 ERA+), with about 1600 strikeouts, 600 walks, 268 HR, and 2400 hits in about 2500 IP. Those don't strike me as HOF numbers, to tell you the truth. Nolan Ryan got in the HOF with approximately the same ERA+ and a worse W/L %, but he also won 300 games, struck out 5700 batters, and pitched over twice as many innings. Eck's starting numbers are basically the same as Jamie Moyer's career numbers.

As a reliever...well, Tango covered that somewhat above. I won't go into my whole argument right now; I'll just say: Eck, as a reliever, had 4 superlative seasons (1988, 1989, 1990, 1992), 3 very good seasons (1987, 1991, 1996), four average seasons (1993, 1994, 1997, 1998), and one bad season (1995). That relieving career, taken by itself, doesn't strike me as a HOF career either - a little short, maybe, with too many mediocre years in there. Bruce Sutter has a similar career (eight very good or superlative years and 4 bad years), though he averaged 100 IP/season in his good years while Eck averaged 77 IP/season. In any case, Sutter isn't in the HOF (though his case is certainly much, much better than Jamie Moyer's).

To me, then, his career as a starter, taken by itself, doesn't merit inclusion in the HOF and his career as a reliever, taken by itself, barely merits inclusion in the HOF, if it does at all.

So, the question is: Do two careers that, by themselves, fall short of the HOF, equal one single HOF career when crammed into one career?

I haven't yet figured out an answer to that question - so I don't know yet whether I think Eck should be in the HOF or not. Right now I am leaning towards exclusion, but it's hard to get a handle on the worth of relievers (LI helps, though :)

*For these stats, I treated his pre-1987 career as exclusively a starter and his post-1986 career as exclusively a reliever.

Diamond Mind Baseball - Gold Glove Winners (December 11, 2003)

Discussion Thread

Posted 12:02 p.m., December 15, 2003 (#23) - bob mong(e-mail) (homepage)
Thanks for the sneak peek, MGL! Fun to read and think about.

I thought this was interesting (from the article): "If Dan Wilson (92 starts) didn't share the position with Ben Davis, he'd get my vote. He was part of the duo that led the league in fewest steals allowed, he led the league in fielding percentage (only one error), and shared the lead in fewest passed balls allowed among catchers with at least 800 innings. But it's hard to pick a guy who caught only 57% of his team's innings, so I'll concur with the voters and give the nod to Molina."

I thought this was interesting because the guys at U.S.S. Mariner have often said that Dan Wilson is very overrated defensively. So I emailed them and asked them what they thought of Tippett's comment, and Derek emailed me back and said (paraphrase) that since Dan Wilson caught Moyer almost exclusively last year, that throws off the SB/CS numbers (and possibly the other numbers, too), since Moyer is left-handed, controls the running game very well, and also has good control and therefore doesn't throw a lot of balls in the dirt or past the catcher. What do you guys think? MGL, what does UZR say about Wilson? Does catching most of Moyer's innings skew Wilson's numbers up (and, Ben Davis' numbers down)?

Clutch Hitters (January 27, 2004)

Discussion Thread

Posted 3:27 p.m., January 27, 2004 (#1) - bob mong
Man, Cirillo has sucked.

Sheehan: Foulkelore (January 29, 2004)

Discussion Thread

Posted 5:19 p.m., January 29, 2004 (#2) - bob mong(e-mail) (homepage)
Thanks for your great, as usual, look at mini-studies. I read this article yesterday and was immediately annoyed: Aren't the results (I thought to myself) he presented exactly consistent with every baseball player's (or at least every pitcher's) aging pattern? He didn't show any evidence that relievers were worse than anybody else.

So anyway, I pulled similar data for starting pitchers since 1980, and got basically the same results.

So, here's the data, in case anybody is interested, for starting pitchers with the most WARP between ages 26 and 30 (only for pitchers whose age-26 season occurred in 1980 or later):

First, here are the fifteen pitchers.

Greg Maddux
Pedro Martinez
Roger Clemens
Orel Hershiser
Frank Viola
Mike Mussina
Mark Langston
Teddy Higuera
Jack McDowell
Tom Glavine
Chuck Finley
Dave Stieb
Randy Johnson
Jack Morris
David Cone

No real surprises? Greg Maddux was worth 53.9 WARP3 from age 26 to age 30, David Cone was worth 32.7.

Here are the age-by-age averages:

26: 7.7
27: 8.5
28: 8.0
29: 7.5
30: 7.9
31: 6.7
32: 6.2
33: 5.5
34: 5.3

Total from 26-30: 39.6
Average from 26-30: 7.9

Smack the Pingu (January 29, 2004)

Discussion Thread

Posted 12:56 p.m., January 30, 2004 (#8) - bob mong
1334.1

Peak Age by Year of Birth (February 11, 2004)

Discussion Thread

Posted 4:26 p.m., February 12, 2004 (#2) - bob mong (homepage)
Thanks for the link, Tango. Didn't weight squat, no minimum career length, no nothing. Total data dump, in graphical form :)
Pretty crude.

Ideas on better ways to do this? What would be a good minimum career length?

Also, it is interesting that the age has been increasing over the past 15 years...but also interesting that it has been increasing to the same, or only a slightly higher, level than that of players born between ~1908 and ~1920. So my question is: why the dip for players born between 1924 and 1948? Various wars? Something else?

Diamond Mind Baseball - Team Forecast Results (February 12, 2004)

Discussion Thread

Posted 12:16 p.m., February 13, 2004 (#1) - bob mong (homepage)
How about the collective wisdom of the Primates?

Check out our collective prediction.

Here's what we thought:
AL East: BOS, NYY, TOR, BAL, TBD (score: 2)
AL Central: MIN, CHW, CLE, KCR, DET (score: 2)
AL West: OAK, SEA, ANA, TEX (score: 0)

NL East: PHI, NYM, ATL, MON, FLA (score: 26)
NL Central: HOU, STL, CHC, CIN, PIT, MIL (score: 8)
NL West: SFG, ARI, LAD, COL, SDP (score: 2)

Using the DMB scoring system, we scored a 40. Not too bad, I don't think. But wait! If you look back at the list of predictions, you can see that somebody hacked the system! Somebody entered 20 predictions (#s 523-542), all with the same prediction: Everybody finishes with 81 wins except the Yankees, who finish with zero and the Mets, who finish with 199.

If we factor out these 20 obviously silly predictions, then here are the revised standings (changes marked "new score"):
AL East: NYY, BOS, TOR, BAL, TBD (new score: 0)
AL Central: MIN, CHW, CLE, KCR, DET (score: 2)
AL West: OAK, SEA, ANA, TEX (score: 0)

NL East: PHI, ATL, NYM, MON, FLA (new score: 14)
NL Central: HOU, STL, CHC, CIN, PIT, MIL (score: 8)
NL West: SFG, ARI, LAD, COL, SDP (score: 2)

So the actual Primate score is...26! If Scott and Tippett had seen fit to include us, we would have placed 2nd. We beat Vegas, we beat DMB, we beat everybody except the LA Times. Good job everybody (except the hacker)!

More Help Requested (March 4, 2004)

Discussion Thread

Posted 11:55 a.m., March 5, 2004 (#4) - bob mong (homepage)
I agree with Alan's last sentence. I think it is the most important point to consider.

Whether or not you throw away any ballots, you should perform your analysis on the entire sample AND your modified sample, and note any differences.

For my job, I read a lot of articles from medical journals, almost all of which are the results of a clinical trial. Almost without exception, every single article analyzes the results using what they call "intent-to-treat analysis." This means that, whether a particular patient actually received the treatment they were supposed to, or any treatment at all, all patients who were initially included in the clinical trial are analyzed as if they had received the treatment they were supposed to receive.

Now, this isn't directly analogous, since if you don't do that in clinical trials you can get some serious selection issues (i.e., maybe 90% of the patients who didn't end up undergoing any treatment did so because they were too ill to do so), and a similar problem doesn't exist for your sample. But, all the same, I think it is best to use the sample you have collected for your analysis. One bad/goofy datapoint out of thirty won't affect any end-analysis too much, I wouldn't think.

The 2004 Marcels (March 10, 2004)

Discussion Thread

Posted 3:35 p.m., March 10, 2004 (#6) - bob mong
Just to clarify: You have not made any park adjustments at any point in the calculations, correct?

The 2004 Marcels (March 10, 2004)

Discussion Thread

Posted 11:39 a.m., March 11, 2004 (#14) - bob mong (homepage)
Posted 3:35 p.m., March 10, 2004 (#6) - bob mong
Just to clarify: You have not made any park adjustments at any point in the calculations, correct?

Posted 3:42 p.m., March 10, 2004 (#7) - tangotiger
Correct. I did only and exactly what I listed above.

I just brought this up because it will affect projections for some players more than others - namely, players who are going from pitchers' parks to hitters' parks or vice versa. Like Alex Rodriguez, Alfonso Soriano, anybody coming or going to Colorado, etc. Just something to keep in mind.

2004 Team Previews Around the Web (March 10, 2004)

Discussion Thread

Posted 11:35 a.m., March 11, 2004 (#1) - bob mong (homepage)
"At Home Plate" has a bunch of team previews up. Check their main archive. They have previews for the Braves, the Angels, the Diamondbacks, the Cardinals, the Blue Jays, the Tigers, the Orioles, and the Padres.

2004 Team Previews Around the Web (March 10, 2004)

Discussion Thread

Posted 7:06 p.m., March 12, 2004 (#5) - bob mong (homepage)
The Royals must bring out the inner poet in previewers. :)

Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.